Discovering Cyclic Causal Models in Psychological Research

Inferring Causal Relations from Observational Data

Kyuri Park

May 22, 2023

Network Theory in Psychology

Mental disorder is produced by direct causal interactions between symptoms that reinforce each other via feedback loops (Borsboom & Cramer, 2013).

Inferring causality from observational data

Directed Acyclic Graph (DAG)

Directed Cyclic Graph (DCG)



Problem: Estimating a cyclic causal model is fundamentally difficult. Relaxing the acyclicity assumption introduces considerable theoretical complications.

Solutions exist

Performance Evaluation

Summary

  • Causal inference is a fundamental interest in science.

  • The underlying dynamic processes of many systems contain cycles.

  • Our study showcases cyclic causal discovery algorithms that are potentially suitable for typical observational data in psychology.

  • Causal discovery methods could provide much richer insights into the underlying causal dynamics of the system than statistical network models.

  • The conclusion is nuanced: there is no one-size-fits-all algorithm.

  • Learning cyclic causal models from observational data is challenging.

References


Borsboom, D., & Cramer, A. O. J. (2013). Network analysis: An integrative approach to the structure of psychopathology. Annual Review of Clinical Psychology, 9(1), 91–121. https://doi.org/10.1146/annurev-clinpsy-050212-185608
Mooij, J. M., & Claassen, T. (2020). Constraint-based causal discovery using partial ancestral graphs in the presence of cycles. In J. Peters & D. Sontag (Eds.), Proceedings of the 36th Conference on Uncertainty in Artificial Intelligence (UAI) (Vol. 124, pp. 1159–1168). PMLR. https://proceedings.mlr.press/v124/m-mooij20a.html
Richardson, T. (1996). Discovering cyclic causal structure. Carnegie Mellon [Department of Philosophy].
Strobl, E. V. (2019). A constraint-based algorithm for causal discovery with cycles, latent variables and selection bias. International Journal of Data Science and Analytics, 8(1), 33–56. https://doi.org/10.1007/s41060-018-0158-2

Theoretical complication

  • Global Markov property is no longer guaranteed.

    • Need extra restrictions on \(P\) (e.g., linear relations with independent error terms \(\varepsilon\)).
  • A cyclic model is not always statistically identified (even in the linear case).

    • Many equivalent models!
  • An equilibrium state is necessary (all eigenvalues satisfy \(|\lambda| < 1\)).
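
The equilibrium condition can be illustrated with a minimal sketch. It assumes a two-variable linear cyclic model \(X = BX + \varepsilon\) in which \(\lambda\) denotes the eigenvalues of the coefficient matrix \(B\); the matrices `B_stable` and `B_unstable`, the error vector `eps`, and all function names are hypothetical and chosen only for illustration. When all \(|\lambda| < 1\), repeatedly applying the feedback loop settles at the fixed point \((I - B)^{-1}\varepsilon\); otherwise the values grow without bound.

```python
import cmath


def eigenvalues_2x2(B):
    """Eigenvalues of a 2x2 matrix, computed from its trace and determinant."""
    (a, b), (c, d) = B
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2


def is_stable(B):
    """True if every eigenvalue of B lies strictly inside the unit circle."""
    return all(abs(lam) < 1 for lam in eigenvalues_2x2(B))


def iterate(B, eps, steps=200):
    """Apply X <- B X + eps repeatedly, starting from X = 0."""
    x = [0.0, 0.0]
    for _ in range(steps):
        x = [B[0][0] * x[0] + B[0][1] * x[1] + eps[0],
             B[1][0] * x[0] + B[1][1] * x[1] + eps[1]]
    return x


def equilibrium(B, eps):
    """Closed-form fixed point (I - B)^{-1} eps for the 2x2 case."""
    a, b = 1 - B[0][0], -B[0][1]
    c, d = -B[1][0], 1 - B[1][1]
    det = a * d - b * c
    return [(d * eps[0] - b * eps[1]) / det,
            (-c * eps[0] + a * eps[1]) / det]


# Hypothetical coefficients: B[i][j] is the effect of X_j on X_i,
# so both matrices encode the cycle X1 -> X2 -> X1.
B_stable = [[0.0, 0.5], [0.4, 0.0]]    # eigenvalues +/- sqrt(0.2)
B_unstable = [[0.0, 1.5], [1.2, 0.0]]  # eigenvalues +/- sqrt(1.8)
eps = [1.0, -0.5]

print("stable cycle:  ", is_stable(B_stable), iterate(B_stable, eps))
print("equilibrium:   ", equilibrium(B_stable, eps))
print("unstable cycle:", is_stable(B_unstable))
```

For the stable matrix, the iterated values agree with the closed-form equilibrium; for the unstable one, the same iteration diverges, which is why an equilibrium state has to be assumed before a cyclic model can be estimated from cross-sectional data.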

Solutions exist, but…

Project goals


  1. Give an accessible overview of the algorithms.
  2. Investigate the performance of each algorithm.
  3. Apply to empirical data to assess practical applicability.

Simulation settings

Constraint-based algorithm output

Performance evaluation

Evaluation metrics

  • Structural Hamming distance (SHD) = \(A\) (Addition) + \(D\) (Deletion) + \(C\) (Changes)
  • Precision = \(\frac{TP}{TP + FP}\)
  • Recall = \(\frac{TP}{TP + FN}\)
  • Uncertainty rate = \(\frac{\text{Number of circle endpoints} (\circ)}{\text{Total number of possible endpoints}}\)

  • Example:

    • For the arrowhead (\(>\)): precision = \(\frac{4}{4 + 3 + 0}\) and recall = \(\frac{4}{4 + 0 + 0}\).

    • For the example PAG output in (b), with (a) as the true ancestral graph, SHD = 6: 0 (A) + 0 (D) + 6 (C).
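
The metrics above can be reproduced with a short sketch. It reads the extra terms in the example denominators as false positives (3 + 0) and false negatives (0 + 0) coming from the other endpoint types, which is an assumption about the example; the function names are hypothetical, and the counts used for the uncertainty rate are made up for illustration and do not come from the example graphs.

```python
from fractions import Fraction


def precision(tp, fp):
    """Share of estimated endpoints of a given type that are correct."""
    return Fraction(tp, tp + fp)


def recall(tp, fn):
    """Share of true endpoints of a given type that were recovered."""
    return Fraction(tp, tp + fn)


def shd(additions, deletions, changes):
    """Structural Hamming distance: A + D + C."""
    return additions + deletions + changes


def uncertainty_rate(n_circle, n_possible):
    """Circle endpoints divided by the total number of possible endpoints."""
    return Fraction(n_circle, n_possible)


# Arrowhead counts from the example: TP = 4, FP = 3 + 0, FN = 0 + 0.
arrow_precision = precision(4, 3 + 0)
arrow_recall = recall(4, 0 + 0)

# SHD from the example: 0 additions, 0 deletions, 6 changes.
example_shd = shd(0, 0, 6)

# Hypothetical counts (not from the example): 5 circles among 20 endpoints.
example_uncertainty = uncertainty_rate(5, 20)

print(f"precision = {arrow_precision} ({float(arrow_precision):.3f})")
print(f"recall    = {arrow_recall} ({float(arrow_recall):.3f})")
print(f"SHD       = {example_shd}")
print(f"uncertainty rate = {example_uncertainty}")
```

Exact fractions are used so that the output can be compared directly with the fractions shown in the example.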

Typical behavior of algorithms

(a): True ancestral graph
(b): PAG estimated by CCD
(c): PAG estimated by FCI
(d): PAG estimated by CCI

Empirical example

Possible practical application

  • Personalized psychotherapy (target symptoms)

  • Medical: effective treatment design

Follow-up

Possible combination with different types of causal discovery algorithms \(\rightarrow\) Hybrid!

CCD + GES (greedy equivalence search)

Thank you